Nearest Neighbor Searching in Metric Spaces: Experimental Results for sb(S)
Author
Abstract
Given a set S of n sites (points), and a distance measure d, the nearest neighbor searching problem is to build a data structure so that given a query point q, the site nearest to q can be found quickly. This paper gives a data structure for this problem; the data structure is built using the distance function as a “black box”. The structure is able to speed up nearest neighbor searching in a variety of settings, for example: points in low-dimensional or structured Euclidean space, strings under Hamming and edit distance, and bit vector data from an OCR application. The data structures are observed to need linear space, with a modest constant factor. The preprocessing time needed per site is observed to match the query time. The data structure can be viewed as an application of a “kd-tree” approach in the metric space setting, using Voronoi regions of a subset in place of axis-aligned boxes.
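To make the problem statement concrete, the following is a minimal sketch of the brute-force baseline that any such data structure aims to beat: a linear scan that treats the distance function as a black box. It is not the structure described in the abstract. The names `nearest_site` and `hamming` are hypothetical and chosen here only for illustration.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def hamming(a: str, b: str) -> int:
    """Hamming distance: number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def nearest_site(
    sites: Sequence[T], q: T, d: Callable[[T, T], float]
) -> Tuple[T, float]:
    """Return (site, distance) for the site nearest to q.

    Illustrative baseline only: it evaluates d once per site, so a query
    costs n distance computations. The distance d is used purely as a
    black box; nothing about the sites' representation is assumed.
    """
    if not sites:
        raise ValueError("sites must be non-empty")
    best = sites[0]
    best_dist = d(best, q)
    for s in sites[1:]:
        dist = d(s, q)
        if dist < best_dist:  # ties keep the earliest site
            best, best_dist = s, dist
    return best, best_dist
```

Because only `d` touches the data, the same function works unchanged for strings under Hamming distance (as above) or for points under Euclidean distance, for example by passing `math.dist` as `d`.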
Similar resources
Nearest-Neighbor Searching and Metric Space Dimensions
Given a set S of points in a metric space with distance function D, the nearest-neighbor searching problem is to build a data structure for S so that for an input query point q, the point s ∈ S that minimizes D(s, q) can be found quickly. We survey approaches to this problem, and its relation to concepts of metric space dimension. Several measures of dimension can be estimated using nearest-nei...
Non-zero probability of nearest neighbor searching
Nearest Neighbor (NN) searching is a challenging problem in data management and has been widely studied in data mining, pattern recognition and computational geometry. The goal of NN searching is to efficiently report the data item nearest to a given query object. Most studies assume that both the data and the query are precise; however, due to the real applications of NN searching, suc...
Space-Time Tradeoffs for Proximity Searching in Doubling Spaces
We consider approximate nearest neighbor searching in metric spaces of constant doubling dimension. More formally, we are given a set S of n points and an error bound ε > 0. The objective is to build a data structure so that given any query point q in the space, it is possible to efficiently determine a point of S whose distance from q is within a factor of (1 + ε) of the distance between q and...
Using the k-Nearest Neighbor Graph for Proximity Searching in Metric Spaces
Proximity searching consists of retrieving from a database the objects that are close to a query. For this type of searching problem, the most general model is the metric space, where proximity is defined in terms of a distance function. One solution to this problem is to build an offline index to quickly satisfy online queries. The ultimate goal is to use as few distance computations as p...
The Analysis of a Probabilistic Approach to Nearest Neighbor Searching
Let S be a set of n data points in some metric space. Given a query point q in this space, a nearest neighbor query asks for the point of S nearest to q. Throughout, we assume that the space is real d-dimensional space R^d and that the metric is Euclidean distance. The goal is to preprocess S into a data structure so that such queries can be answered efficiently. Nearest neighbor searching has ap...